Forward masking on a generalized logarithmic scale for robust speech recognition
Authors
Abstract
This paper examines forward masking on a generalized logarithmic scale for speech recognition that is robust to both additive and convolutional noise. Forward masking in the dynamic cepstral (DyC) representation subtracts a masking pattern from the current spectrum in the logarithmic spectral domain, whereas the proposed method seeks a compromise between the logarithmic and linear spectral domains by choosing an appropriate value of the power. The technique is incorporated into a modified MFCC-based front end. Connected-digit recognition tests showed that in noisy conditions it outperforms conventional techniques such as DyC, continuous spectral subtraction, and cepstral mean subtraction, while maintaining robustness to convolutional noise.
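To make the idea of masking on a scale between logarithmic and linear concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the abstract gives neither the exact form of the generalized logarithm nor the masking pattern, so the mapping `(x**gamma - 1) / gamma`, the exponentially decaying weights, and the names `gamma`, `alpha`, `decay`, and `n_prev` are all assumptions chosen for illustration.

```python
import numpy as np

def generalized_log(x, gamma):
    """Assumed generalized logarithm: (x**gamma - 1) / gamma.

    gamma = 1 gives a shifted linear scale (x - 1);
    gamma -> 0 approaches the natural logarithm.
    """
    x = np.asarray(x, dtype=float)
    if gamma == 0.0:
        return np.log(x)
    return (x ** gamma - 1.0) / gamma

def forward_mask(spectra, gamma=0.1, alpha=0.2, decay=0.5, n_prev=4):
    """Subtract a masking pattern built from preceding frames.

    spectra: array of shape (frames, channels) with positive
    filterbank energies. The masking pattern is a hypothetical
    exponentially decaying weighted sum of the previous n_prev
    frames, computed on the generalized logarithmic scale.
    """
    g = generalized_log(spectra, gamma)
    # Weight for lag 1 is alpha, for lag 2 is alpha * decay, and so on.
    weights = alpha * decay ** np.arange(n_prev)
    masked = g.copy()
    for lag, w in enumerate(weights, start=1):
        # Frame t loses w * (unmasked frame t - lag).
        masked[lag:] -= w * g[:-lag]
    return masked

# Usage: 5 frames, 3 channels of constant energy 2.0 on the linear-like scale.
example = forward_mask(np.full((5, 3), 2.0), gamma=1.0, n_prev=2)
```

In this sketch, `gamma` plays the role of the power mentioned in the abstract: values near 0 behave like the log-domain masking of DyC, while values near 1 behave like subtraction in the linear spectral domain.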
Similar resources
Improved Forward Masking on a Generalized Logarithmic Scale for Robust Speech Recognition
We previously proposed a forward masking on a generalized logarithmic scale to eliminate convolutional noise as well as to suppress additive noise. While the generalized Dynamic Cepstrum derived from the masked spectrum has been robust to both noises, the robustness to convolutional noise slightly degrades as compared to masking on the logarithmic scale, and the optimal masking coefficient depe...
Evaluation of a generalized dynamic cepstrum in distant speech recognition
This paper examines the effectiveness of a generalized dynamic cepstrum in distant speech recognition. The generalized dynamic cepstrum (DyMFGC) is based upon the forward masking on the generalized logarithmic spectrum instead of the log-spectrum, which intends to make it robust to additive noise as well as convolutional noise. Digit recognition tests were carried out in a relatively quiet and ...
A model of dynamic auditory perception and its application to robust word recognition
This paper describes two mechanisms that augment the common automatic speech recognition (ASR) front end and provide adaptation and isolation of local spectral peaks. A dynamic model consisting of a linear filterbank with a novel additive logarithmic adaptation stage after each filter output is proposed. An extensive series of perceptual forward masking experiments, together with previously rep...
An auditory feature extraction method based on forward-masking and its application in robust speaker identification and speech recognition
This work is supported by National Nature Science Funds of China. This article presents a new auditory feature extraction method, which considers the forward-masking mechanism of auditory nerves and is feasible in practice. Two features based on this method are extracted: FMFRC (forward masking firing-rate cepstrum) and FMSRC (forward masking synchronized rate cepst...
An Information-Theoretic Discussion of Convolutional Bottleneck Features for Robust Speech Recognition
Convolutional Neural Networks (CNNs) have shown their performance in speech recognition systems for both feature extraction and acoustic modeling. In addition, CNNs have been used for robust speech recognition, and competitive results have been reported. The Convolutive Bottleneck Network (CBN) is a kind of CNN that has a bottleneck layer among its fully connected layers. The bottleneck fea...
Publication date: 2000